A multivariate regression approach to association analysis of a quantitative trait network

نویسندگان

  • Seyoung Kim
  • Kyung-Ah Sohn
  • Eric P. Xing
چکیده

MOTIVATION Many complex disease syndromes such as asthma consist of a large number of highly related, rather than independent, clinical phenotypes, raising a new technical challenge in identifying genetic variations associated simultaneously with correlated traits. Although a causal genetic variation may influence a group of highly correlated traits jointly, most of the previous association analyses considered each phenotype separately, or combined results from a set of single-phenotype analyses. RESULTS We propose a new statistical framework called graph-guided fused lasso to address this issue in a principled way. Our approach represents the dependency structure among the quantitative traits explicitly as a network, and leverages this trait network to encode structured regularizations in a multivariate regression model over the genotypes and traits, so that the genetic markers that jointly influence subgroups of highly correlated traits can be detected with high sensitivity and specificity. While most of the traditional methods examined each phenotype independently, our approach analyzes all of the traits jointly in a single statistical method to discover the genetic markers that perturb a subset of correlated traits jointly rather than a single trait. Using simulated datasets based on the HapMap consortium data and an asthma dataset, we compare the performance of our method with the single-marker analysis, and other sparse regression methods that do not use any structural information in the traits. Our results show that there is a significant advantage in detecting the true causal single nucleotide polymorphisms when we incorporate the correlation pattern in traits using our proposed methods. AVAILABILITY Software for GFlasso is available at http://www.sailing.cs.cmu.edu/gflasso.html.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Multivariate Regression Approach to Association Analysis of Quantitative Trait Network

Many complex disease syndromes such as asthma consist of a large number of highly related, rather than independent, clinical phenotypes, raising a new technical challenge in identifying genetic variations associated simultaneously with correlated traits. In this study, we propose a new statistical framework called graph-guided fused lasso (GFlasso) to address this issue in a principled way. Our...

متن کامل

Association mapping of blood pressure levels in a longitudinal framework using binomial regression

Heritable quantitative characters underline complex genetic traits. However, a single quantitative phenotype may not be a suitably good surrogate for a clinical end point trait. It may be more optimal to use a multivariate phenotype vector correlated with the end point trait to carry out an association analysis. Existing methods, such as variance components and principal components, suffer from...

متن کامل

Expression quantitative trait loci mapping with multivariate sparse partial least squares regression.

Expression quantitative trait loci (eQTL) mapping concerns finding genomic variation to elucidate variation of expression traits. This problem poses significant challenges due to high dimensionality of both the gene expression and the genomic marker data. We propose a multivariate response regression approach with simultaneous variable selection and dimension reduction for the eQTL mapping prob...

متن کامل

Statistical Estimation of Correlated Genome Associations to a Quantitative Trait Network

Many complex disease syndromes, such as asthma, consist of a large number of highly related, rather than independent, clinical or molecular phenotypes. This raises a new technical challenge in identifying genetic variations associated simultaneously with correlated traits. In this study, we propose a new statistical framework called graph-guided fused lasso (GFlasso) to directly and effectively...

متن کامل

An efficient approach to large-scale genotype-phenotype association analyses

Modern molecular biotechnology generates a great deal of intermediate information, such as transcriptional and metabolic products in bridging DNA and complex traits. In genome-wide linkage analysis and genome-wide association study, regression analysis for large-scale correlated phenotypes is applied to map genes for those by-products that are regarded as quantitative traits. For a single trait...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • Bioinformatics

دوره 25  شماره 

صفحات  -

تاریخ انتشار 2009